weighted score
Soppia: A Structured Prompting Framework for the Proportional Assessment of Non-Pecuniary Damages in Personal Injury Cases
Applying complex legal rules characterized by multiple, heterogeneously weighted criteria presents a fundamental challenge in judicial decision-making, often hindering the consistent realization of legislative intent. This challenge is particularly evident in the quantification of non-pecuniary damages in personal injury cases. This paper introduces Soppia, a structured prompting framework designed to assist legal professionals in navigating this complexity. By leveraging advanced AI, the system ensures a comprehensive and balanced analysis of all stipulated criteria, fulfilling the legislator's intent that compensation be determined through a holistic assessment of each case. Using the twelve criteria for non-pecuniary damages established in the Brazilian CLT (Art. 223-G) as a case study, we demonstrate how Soppia (System for Ordered Proportional and Pondered Intelligent Assessment) operationalizes nuanced legal commands into a practical, replicable, and transparent methodology. The framework enhances consistency and predictability while providing a versatile and explainable tool adaptable across multi-criteria legal contexts, bridging normative interpretation and computational reasoning toward auditable legal AI.
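The abstract does not state how Soppia combines its criteria, so the following is only a minimal sketch of one common way to aggregate heterogeneously weighted criteria: a weight-normalized average. The function name `weighted_score`, the criterion labels, the weights, and the rating scale are all hypothetical and are not taken from the paper or from CLT Art. 223-G.

```python
from typing import Mapping


def weighted_score(ratings: Mapping[str, float], weights: Mapping[str, float]) -> float:
    """Return the weight-normalized average of per-criterion ratings.

    Assumption: every criterion has both a rating and a weight; this is an
    illustrative aggregation rule, not the method described in the paper.
    """
    if set(ratings) != set(weights):
        raise ValueError("ratings and weights must cover the same criteria")
    total_weight = sum(weights.values())
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive value")
    # Multiply each rating by its weight, then divide by the total weight.
    return sum(ratings[name] * weights[name] for name in ratings) / total_weight


# Hypothetical criteria, weights, and ratings, for illustration only.
example_weights = {"criterion_a": 3.0, "criterion_b": 1.0, "criterion_c": 1.0}
example_ratings = {"criterion_a": 4.0, "criterion_b": 2.0, "criterion_c": 1.0}

print(weighted_score(example_ratings, example_weights))  # (12 + 2 + 1) / 5 = 3.0
```

Normalizing by the total weight keeps the result on the same scale as the individual ratings, which makes scores comparable even when the weights are later revised.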
Multimodal Benchmarking and Recommendation of Text-to-Image Generation Models
Wanaskar, Kapil, Jena, Gaytri, Eirinaki, Magdalini
This work presents an open-source unified benchmarking and evaluation framework for text-to-image generation models, with a particular focus on the impact of metadata augmented prompts. Leveraging the DeepFashion-MultiModal dataset, we assess generated outputs through a comprehensive set of quantitative metrics, including Weighted Score, CLIP (Contrastive Language Image Pre-training)-based similarity, LPIPS (Learned Perceptual Image Patch Similarity), FID (Frechet Inception Distance), and retrieval-based measures, as well as qualitative analysis. Our results demonstrate that structured metadata enrichments greatly enhance visual realism, semantic fidelity, and model robustness across diverse text-to-image architectures. While not a traditional recommender system, our framework enables task-specific recommendations for model selection and prompt design based on evaluation metrics.
Evidence of Cognitive Deficits and Developmental Advances in Generative AI: A Clock Drawing Test Analysis
Galatzer-Levy, Isaac R., McGiffin, Jed, Munday, David, Liu, Xin, Karmon, Danny, Labzovsky, Ilia, Moroshko, Rivka, Zait, Amir, McDuff, Daniel
Generative AI's rapid advancement sparks interest in its cognitive abilities, especially given its capacity for tasks like language understanding and code generation. This study explores how several recent GenAI models perform on the Clock Drawing Test (CDT), a neuropsychological assessment of visuospatial planning and organization. While models create clock-like drawings, they struggle with accurate time representation, showing deficits similar to mild-severe cognitive impairment (Wechsler, 2009). Errors include numerical sequencing issues, incorrect clock times, and irrelevant additions, despite accurate rendering of clock features. Only GPT 4 Turbo and Gemini Pro 1.5 produced the correct time, scoring like healthy individuals (4/4). A follow-up clock-reading test revealed only Sonnet 3.5 succeeded, suggesting drawing deficits stem from difficulty with numerical concepts. These findings may reflect weaknesses in visual-spatial understanding, working memory, or calculation, highlighting strengths in learned knowledge but weaknesses in reasoning. Comparing human and machine performance is crucial for understanding AI's cognitive capabilities and guiding development toward human-like cognitive functions.
What if Dr. Martin Luther King, Jr. had Access to Machine Learning?
As we honor the memory of Dr. Martin Luther King Jr., I want to reflect on all the positive things that have stemmed from this great man sharing his visions and dreams to make the world a better place. What if Dr. King had some of the modern marvels that we use daily to communicate, entertain, and share our personal views with our community and the world: Would things have been different? While having lunch with my friend and fellow #techie Rishabh Sharma, CEO of Poletus--which uses state-of-the-art machine learning as a central nervous system for its product offering--we began to run through the mental algorithmic reasoning of "what if." The first thing that came to mind was: what if Dr. King had access to machine learning? Now, I know that some think that machine learning is something mythical, but it's not.
Patterns of Word Usage in Expert Tutoring Sessions: Verbosity versus Quality
D'Mello, Sidney (University of Memphis)
It is widely acknowledged that one-on-one human tutoring is one of the most effective ways to provide learning; however, the source of its effectiveness is still unclear. Tutor-centered, student-centered, and interaction hypotheses have been proposed as possible explanations of the effectiveness of human tutoring. Most research has addressed this question by analyzing tutorial sessions at the dialogue move or speech act level. The present paper adopts a different approach by focusing on word usage patterns in 50 naturalistic tutorial sessions between human students and expert tutors. Specifically, each unique word in the session was designated as a student-initiative word, a tutor-initiative word, or a shared-initiative word. Comparisons of the frequencies as well as the weights of the words assigned to each of these categories indicated that the student and tutor share initiative even though the tutors are considerably more verbose. The implications of the results for the development of an ITS that aspires to model expert tutors are discussed.